Crowdsourcing the General Public for Large Scale Molecular Pathology Studies in Cancer
Abstract
BACKGROUND Citizen science, scientific research conducted by non-specialists, has the potential to facilitate biomedical research using available large-scale data; however, validating the results is challenging. Cell Slider is a citizen science project that intends to share images from tumors with the general public, enabling them to score tumor markers independently through an internet-based interface.
METHODS From October 2012 to June 2014, 98,293 Citizen Scientists accessed the Cell Slider web page and scored 180,172 sub-images derived from images of 12,326 tissue microarray cores labeled for estrogen receptor (ER). We evaluated the accuracy of the Citizen Scientists' ER classification, and the association between ER status and prognosis, by comparing their test performance against that of trained pathologists.
FINDINGS The area under the ROC curve was 0.95 (95% CI 0.94 to 0.96) for cancer cell identification and 0.97 (95% CI 0.96 to 0.97) for ER status. ER-positive tumors scored by Citizen Scientists were associated with survival in a similar way to those scored by trained pathologists. Survival probability at 15 years was 0.78 (95% CI 0.76 to 0.80) for ER-positive and 0.72 (95% CI 0.68 to 0.77) for ER-negative tumors based on the Citizen Scientists' classification. Based on the pathologists' classification, survival probability was 0.79 (95% CI 0.77 to 0.81) for ER-positive and 0.71 (95% CI 0.67 to 0.74) for ER-negative tumors. For ER scored by Citizen Scientists, the hazard ratio for death was 0.26 (95% CI 0.18 to 0.37) at diagnosis and became greater than one after 6.5 years of follow-up. For ER scored by pathologists, it was 0.24 (95% CI 0.18 to 0.33) at diagnosis, increasing thereafter to one after 6.7 (95% CI 4.1 to 10.9) years of follow-up.
INTERPRETATION Crowdsourcing the general public to classify cancer pathology data for research is viable, engages the public, and provides accurate ER data. Crowdsourced classification of research data may offer a valid solution to problems of throughput requiring human input.
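The area under the ROC curve reported in the findings measures how well a continuous score separates reference-positive from reference-negative cases. The following is a minimal sketch of that calculation, not the study's actual pipeline. The abstract does not say how crowd votes were aggregated per core, so the per-core "fraction of crowd votes that were ER-positive" is an assumption, and the function name `roc_auc` and all data values are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pair-counting identity.

    scores: one crowd-derived score per core (higher = more likely positive)
    labels: one reference label per core (1 = positive, 0 = negative)
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (NOT taken from the study).
# Assumed aggregation: fraction of crowd votes calling each core ER-positive.
crowd_fraction_positive = [0.9, 0.8, 0.7, 0.4, 0.35, 0.1]
pathologist_er_status = [1, 1, 0, 1, 0, 0]

auc = roc_auc(crowd_fraction_positive, pathologist_er_status)
print(f"AUC = {auc:.3f}")  # 8 of 9 pairs ranked correctly -> AUC = 0.889
```

An AUC of 1.0 would mean every pathologist-positive core received a higher crowd score than every pathologist-negative core, while 0.5 corresponds to chance-level ranking.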
Similar resources
Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest
Background & objective: Microarray and next generation sequencing (NGS) data are the important sources to find helpful molecular patterns. Also, the great number of gene expression data increases the challenge of how to identify the biomarkers associated with cancer. The random forest (RF) is used to effectively analyze the problems of large-p and smal...
Crowdsourcing scoring of immunohistochemistry images: Evaluating Performance of the Crowd and an Automated Computational Method
The assessment of protein expression in immunohistochemistry (IHC) images provides important diagnostic, prognostic and predictive information for guiding cancer diagnosis and therapy. Manual scoring of IHC images represents a logistical challenge, as the process is labor intensive and time consuming. Since the last decade, computational methods have been developed to enable the application of ...
Perform Three Data Mining Tasks with Crowdsourcing Process
For data mining studies, because of the complexity of doing feature selection process in tasks by hand, we need to send some of labeling to the workers with crowdsourcing activities. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the age or geography of the users' residence. Uncertainty about the performance of virtual user...
Prevalence of MPL (W515K/L) Mutations in Patients with Negative-JAK2 (V617F) Myeloproliferative Neoplasm in North-East of Iran
Background and Objective: Janus kinase 2 (JAK2) and Myeloproliferative Leukemia (MPL) mutations are confirmatory indicators for Myeloproliferative Neoplasm (MPN). The current study was performed to determine the frequency of MPL mutation in MPN patients without JAK2 mutation, in order to assign MPL mutation frequency in North-East of Iran. Methods: Total o...
While Urine and Plasma Decorin Remain Unchanged in Prostate Cancer, Prostatic Tissue Decorin Has a Prognostic Value
Background: Numerous studies have confirmed that a significant decrease in tissue decorin (DCN) expression is associated with tumor progression and metastasis in certain types of cancer, including prostate cancer (PC). However, the potential prognostic value of tissue DCN in PC has not yet been investigated. Methods: A total number of 40 PC and 42 patients with benign prostatic hyperplasia (BPH) were inv...
Journal title:
Volume 2, Issue
Pages -
Publication date: 2015